NSF PAR Search | NSF Public Access Repository

FPGA-Accelerated Range-Limited Molecular Dynamics

https://doi.org/10.1109/TC.2024.3375613

Wu, Chunshu; Yang, Chen; Bandara, Sahan; Geng, Tong; Guo, Anqi; Haghi, Pouya; Li, Ang; Herbordt, Martin (June 2024, IEEE Transactions on Computers)

Full Text Available
SmartFuse: Reconfigurable Smart Switches to Accelerate Fused Collectives in HPC Applications

https://doi.org/10.1145/3650200.3656616

Haghi, Pouya; Tan, Cheng; Guo, Anqi; Wu, Chunshu; Liu, Dongfang; Li, Ang; Skjellum, Anthony; Geng, Tong; Herbordt, Martin (May 2024, ACM)

Full Text Available
Software-Hardware Co-design of Heterogeneous SmartNIC System for Recommendation Models Inference and Training

https://doi.org/10.1145/3577193.3593724

Guo, Anqi; Hao, Yuchen; Wu, Chunshu; Haghi, Pouya; Pan, Zhenyu; Si, Min; Tao, Dingwen; Li, Ang; Herbordt, Martin; Geng, Tong (June 2023, The 37th ACM International Conference on Supercomputing (ICS 2023))

Full Text Available
FLASH: FPGA-Accelerated Smart Switches with GCN Case Study

https://doi.org/10.1145/3577193.3593739

Haghi, Pouya; Krska, William; Tan, Cheng; Geng, Tong; Chen, Po Hao; Greenwood, Connor; Guo, Anqi; Hines, Thomas; Wu, Chunshu; Li, Ang; et al. (June 2023, ICS 2023: International Conference on Supercomputing)

Full Text Available
H-GCN: A Graph Convolutional Network Accelerator on Versal ACAP Architecture

https://doi.org/10.1109/FPL57034.2022.00040

Zhang, Chengming; Geng, Tong; Guo, Anqi; Tian, Jiannan; Herbordt, Martin; Li, Ang; Tao, Dingwen (August 2022, 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL 2022))

Graph Neural Networks (GNNs) have drawn tremendous attention due to their unique capability to extend Machine Learning (ML) approaches to applications broadly defined as having unstructured data, especially graphs. Compared with other ML modalities, the acceleration of GNNs is more challenging due to the irregularity and heterogeneity derived from graph topologies. Existing efforts, however, have focused mainly on handling graphs’ irregularity and have not studied their heterogeneity. To this end, we propose H-GCN, a PL (Programmable Logic) and AIE (AI Engine) based hybrid accelerator that leverages the emerging heterogeneity of Xilinx Versal Adaptive Compute Acceleration Platforms (ACAPs) to achieve high-performance GNN inference. In particular, H-GCN partitions each graph into three subgraphs based on its inherent heterogeneity and processes them using PL and AIE. To further improve performance, we explore the sparsity support of AIE and develop an efficient density-aware method to automatically map tiles of sparse matrix-matrix multiplication (SpMM) onto the systolic tensor array. Compared with state-of-the-art GCN accelerators, H-GCN achieves, on average, speedups of 1.1× to 2.3×.
Full Text Available
A Framework for Neural Network Inference on FPGA-Centric SmartNICs

https://doi.org/10.1109/FPL57034.2022.00071

Guo, Anqi; Geng, Tong; Zhang, Yongan; Haghi, Pouya; Wu, Chunshu; Tan, Cheng; Lin, Yingyan; Li, Ang; Herbordt, Martin (August 2022, 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL))

Full Text Available
FCsN: A FPGA-Centric SmartNIC Framework for Neural Networks

https://doi.org/10.1109/FCCM53951.2022.9786193

Guo, Anqi; Geng, Tong; Zhang, Yongan; Haghi, Pouya; Wu, Chunshu; Tan, Cheng; Lin, Yingyan; Li, Ang; Herbordt, Martin (May 2022, 30th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM))

Full Text Available
Reconfigurable switches for high performance and flexible MPI collectives

https://doi.org/10.1002/cpe.6769

Haghi, Pouya; Guo, Anqi; Xiong, Qingqing; Yang, Chen; Geng, Tong; Broaddus, Justin T.; Marshall, Ryan; Schafer, Derek; Skjellum, Anthony; Herbordt, Martin C. (March 2022, Concurrency and Computation: Practice and Experience)

Full Text Available
A Reconfigurable Compute-in-the-Network FPGA Assistant for High-Level Collective Support with Distributed Matrix Multiply Case Study

https://doi.org/10.1109/ICFPT51103.2020.00030

Haghi, Pouya; Guo, Anqi; Geng, Tong; Broaddus, Justin; Schafer, Derek; Skjellum, Anthony; Herbordt, Martin (December 2020, IEEE Conference on Field Programmable Technology)
Full Text Available
FP-AMG: FPGA-Based Acceleration Framework for Algebraic Multigrid Solvers

https://doi.org/10.1109/FCCM48280.2020.00028

Haghi, Pouya; Geng, Tong; Guo, Anqi; Wang, Tianqi; Herbordt, Martin (May 2020, 28th IEEE International Symposium on Field-Programmable Custom Computing Machines)

Full Text Available
